.. _wps_design_guide:

WPS design guide
================

This guide serves as an introduction to the WPS module. As such, it does not:

*  provide a primer on the WPS protocol, which can be found in the `WPS specification <http://www.opengeospatial.org/standards/wps>`_ (the module implements the WPS 1.0 specification)
*  repeat what can already be found in the class Javadocs
*  explain how to implement an OWS service using the GeoServer OWS framework, which is covered in its dedicated :ref:`guide <ows_services>`

In short, it provides an overview of how the module fits together, leaving the details to other information sources.


General architecture
--------------------

.. note:: We really need to publish the Javadocs somewhere so that this document can link to them

The module follows the usual structure of a GeoServer OWS framework application:

*  a set of KVP parsers and KVP readers to parse the HTTP GET requests, found in the ``org.geoserver.wps.kvp`` package
*  a set of XML parsers to parse the HTTP POST requests, found in the ``org.geoserver.wps.xml`` and
   ``org.geoserver.wps.xml.v1_0_0`` packages
*  a service object interface and implementations responding to the various WPS methods, in particular ``org.geoserver.wps.DefaultWebProcessingService``, which in turn delegates most of the work to the ``GetCapabilities``, ``DescribeProcess`` and ``ExecuteProcess`` classes
*  a set of output transformers taking the results generated by ``DefaultWebProcessingService`` and turning them into the appropriate response (usually, XML). You can find some of those in the ``org.geoserver.wps.response`` package, whilst some others are generic ones that have been parametrized and declared in the Spring context (see the ``applicationContext.xml`` file).

The module makes extensive use of the following GeoTools modules:

*  ``net.opengis.wps`` which contains EMF models of the various elements and types described in the WPS schemas. Those objects are usually what flows between the KVP parsers, XML decoders, the service implementation, and the output transformers 
*  ``gt-xsd-wps`` and ``gt-xsd``, used for all XML encoding and decoding needs 
*  ``gt-process`` that provides the concept of a process, with the ability to self describe its inputs and outputs, and of course execute and produce results

The processes
-------------

The module relies on the ``gt-process`` SPI-based plugin mechanism to look up and use the processes available on the classpath. Implementing a new process boils down to:
 
* creating a ``ProcessFactory`` implementation
* creating one or more ``Process`` implementations
* registering the ``ProcessFactory`` in SPI by adding the factory class name in the ``META-INF/services/org.geotools.process.ProcessFactory`` file

The WPS module shows an example of the above by bridging the Sextante API to the GeoTools process API; see the ``org.geoserver.wps.sextante`` package.
This also means it's possible to rely on libraries of existing processes, provided they are wrapped in a GeoTools process API container.
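
The three implementation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the actual ``gt-process`` API: ``SketchProcess`` and ``SketchProcessFactory`` are simplified, hypothetical stand-ins for ``Process`` and ``ProcessFactory`` (the real interfaces have more methods and different signatures), and ``DoubleItFactory`` is an invented example. The lookup uses the standard ``java.util.ServiceLoader``, which reads the same kind of ``META-INF/services`` file; the lookup code actually used by GeoTools may differ.

.. code-block:: java

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.ServiceLoader;
    import java.util.Set;

    /** Hypothetical stand-in for a process: takes named inputs, returns named outputs. */
    interface SketchProcess {
        Map<String, Object> execute(Map<String, Object> inputs);
    }

    /** Hypothetical stand-in for a factory that describes and creates processes. */
    interface SketchProcessFactory {
        Set<String> getNames();

        Map<String, Class<?>> getInputTypes(String processName);

        SketchProcess create(String processName);
    }

    /** Invented example factory exposing a single "doubleIt" process. */
    class DoubleItFactory implements SketchProcessFactory {
        public Set<String> getNames() {
            return Collections.singleton("doubleIt");
        }

        public Map<String, Class<?>> getInputTypes(String processName) {
            // Self-description: the process takes one numeric input called "value"
            return Collections.<String, Class<?>>singletonMap("value", Double.class);
        }

        public SketchProcess create(String processName) {
            return inputs -> {
                double value = (Double) inputs.get("value");
                return Collections.<String, Object>singletonMap("result", value * 2);
            };
        }
    }

    class SketchProcessLookup {
        /**
         * Returns every factory listed in a provider file named after the
         * factory interface under META-INF/services on the classpath
         * (an empty list if no such file exists).
         */
        static List<SketchProcessFactory> findFactories() {
            List<SketchProcessFactory> factories = new ArrayList<>();
            for (SketchProcessFactory f : ServiceLoader.load(SketchProcessFactory.class)) {
                factories.add(f);
            }
            return factories;
        }
    }

In this sketch the registration step would consist of a text file under ``META-INF/services``, named after the factory interface and containing the fully qualified name of the factory class on a single line. A factory meant to be discovered this way would normally also be a public class with a public no-argument constructor.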

An alternative way of implementing a custom WPS process, based on Java Annotations, is described in the :ref:`wps_services_implementing` section.

Bridging between objects and I/O formats
----------------------------------------

The WPS specification is very generic. Any process can take pretty much anything as input, and return anything. In practice, this makes WPS a complex, XML-based RPC protocol.

This means WPS can exchange vector data, raster data, plain strings and numbers, spreadsheets, word processor documents, and anything else one can imagine.
Also, given a single type of data, say a plain geometry, there are many useful ways to represent it: it could be GML2, GML3, WKT, WKB, or a one-row shapefile. Different clients will find some formats easier to use than others, meaning the WPS should try to offer as many options as possible for both input and output.

The classes stored in the ``org.geoserver.wps.ppio`` package serve exactly this purpose: turning a representation format into an in-memory object and vice versa. A new subclass of ``ProcessParameterIO`` (PPIO) is needed each time a new format for a known parameter type is desired, or when a process requires a new kind of parameter. The subclass then needs to be registered in the Spring context so that ``ProcessParameterIO.find(Parameter, ApplicationContext)`` can find it.

Both the XML reader and the XML encoders do use the PPIO dynamically: the WPS document structure 
is made so that parameters are actually xs:Any, so bot

The code providing the description of the various processes also scans the available ``ProcessParameterIO`` implementations so that each parameter can be matched with all formats in which it can be represented.
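
The idea of one converter per combination of in-memory type and representation format can be illustrated with the following self-contained sketch. It does not use the real ``ProcessParameterIO`` API: ``SketchPPIO``, ``SketchPoint``, the two converters and ``SketchPPIORegistry`` are all hypothetical names, the MIME types are only illustrative, and the registry is a plain list standing in for the Spring context lookup.

.. code-block:: java

    import java.util.ArrayList;
    import java.util.List;

    /** In-memory object used by the sketch (a real process would use a geometry class). */
    final class SketchPoint {
        final double x;
        final double y;

        SketchPoint(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical, simplified stand-in for ProcessParameterIO. */
    abstract class SketchPPIO<T> {
        final Class<T> type;
        final String mimeType;

        SketchPPIO(Class<T> type, String mimeType) {
            this.type = type;
            this.mimeType = mimeType;
        }

        /** Turns the representation format into an in-memory object. */
        abstract T decode(String input);

        /** Turns the in-memory object back into the representation format. */
        abstract String encode(T value);
    }

    /** Represents a point as WKT, e.g. "POINT (1.0 2.0)". */
    class WktPointPPIO extends SketchPPIO<SketchPoint> {
        WktPointPPIO() {
            super(SketchPoint.class, "application/wkt");
        }

        SketchPoint decode(String input) {
            String body = input.substring(input.indexOf('(') + 1, input.lastIndexOf(')')).trim();
            String[] xy = body.split("\\s+");
            return new SketchPoint(Double.parseDouble(xy[0]), Double.parseDouble(xy[1]));
        }

        String encode(SketchPoint p) {
            return "POINT (" + p.x + " " + p.y + ")";
        }
    }

    /** Represents the same point as "x,y". */
    class CsvPointPPIO extends SketchPPIO<SketchPoint> {
        CsvPointPPIO() {
            super(SketchPoint.class, "text/csv");
        }

        SketchPoint decode(String input) {
            String[] xy = input.trim().split(",");
            return new SketchPoint(Double.parseDouble(xy[0].trim()), Double.parseDouble(xy[1].trim()));
        }

        String encode(SketchPoint p) {
            return p.x + "," + p.y;
        }
    }

    /** Stand-in for the Spring context: knows every registered converter. */
    class SketchPPIORegistry {
        private final List<SketchPPIO<?>> registered = new ArrayList<>();

        void register(SketchPPIO<?> ppio) {
            registered.add(ppio);
        }

        /** Returns all the formats able to represent a parameter of the given type. */
        List<SketchPPIO<?>> findAll(Class<?> parameterType) {
            List<SketchPPIO<?>> matches = new ArrayList<>();
            for (SketchPPIO<?> ppio : registered) {
                if (ppio.type.isAssignableFrom(parameterType)) {
                    matches.add(ppio);
                }
            }
            return matches;
        }
    }

With both converters registered, ``findAll(SketchPoint.class)`` returns two entries, which mirrors how a single process parameter can be advertised in several formats.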

Filtering processes
-------------------

By default GeoServer will publish every process found in SPI or registered in the Spring context.

The ``org.geoserver.wps.process.ProcessFilter`` interface can be implemented to exert some control
over how the processes are published. The interface looks as follows:

.. code-block:: java

	public interface ProcessFilter {
	    ProcessFactory filterFactory(ProcessFactory pf);
	}
	
An implementation of ``ProcessFilter`` can return ``null`` from the ``filterFactory`` call in order
to hide all the processes in that factory from the user, or it can wrap the factory so
that some of its functionality is changed. Wrapping a factory makes it possible to:

* Selectively hide some processes
* Change the process metadata, such as its title and description, and possibly add more translations
  of the process metadata
* Hide some of the process inputs and outputs, possibly defaulting them to a constant value
* Exert control over the process inputs, possibly refusing to run the process under certain circumstances
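
The difference between hiding a whole factory and wrapping it can be sketched as follows. To keep the example self-contained it does not compile against GeoServer: ``NamedProcessFactory`` is a hypothetical stand-in for ``ProcessFactory`` reduced to a set of plain string names, ``SketchProcessFilter`` mirrors the shape of ``ProcessFilter``, and ``DenyListProcessFilter`` is an invented example.

.. code-block:: java

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.LinkedHashSet;
    import java.util.Set;

    /** Hypothetical stand-in for ProcessFactory, reduced to the process names. */
    interface NamedProcessFactory {
        Set<String> getNames();
    }

    /** Same shape as ProcessFilter, but using the stand-in factory type. */
    interface SketchProcessFilter {
        NamedProcessFactory filterFactory(NamedProcessFactory pf);
    }

    /** Hides the processes whose names appear in a deny list. */
    class DenyListProcessFilter implements SketchProcessFilter {
        private final Set<String> denied;

        DenyListProcessFilter(String... deniedNames) {
            this.denied = new HashSet<>(Arrays.asList(deniedNames));
        }

        public NamedProcessFactory filterFactory(NamedProcessFactory pf) {
            Set<String> visible = new LinkedHashSet<>(pf.getNames());
            visible.removeAll(denied);
            if (visible.isEmpty()) {
                // Returning null hides the factory and all of its processes
                return null;
            }
            if (visible.size() == pf.getNames().size()) {
                // Nothing to hide: publish the factory unchanged
                return pf;
            }
            // Wrap the factory so that only the remaining processes are exposed
            return () -> visible;
        }
    }

A wrapper around the real ``ProcessFactory`` would have to delegate all the other factory methods as well; the sketch only shows the decision logic.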

For the common case of mere process selection a base class is provided, ``org.geoserver.wps.process.ProcessSelector``,
where the subclasses only have to check whether a certain process, specified by ``Name``, is allowed
to be exposed.

The GeoServer code base provides two implementations of ``ProcessFilter`` by default:

* ``org.geoserver.wps.UnsupportedParameterTypeProcessFilter``, which hides all the processes having an input or
  an output that the available ``ProcessParameterIO`` classes cannot handle
* ``org.geoserver.wps.DisabledProcessesSelector``, which hides all the processes that the administrator
  disabled in the WPS Admin page in the administration console

Once the ``ProcessFilter`` is coded, it can be activated by declaring it in the Spring application context.
For example, the ``ProcessSelector`` subclass that controls which processes can be exposed based on
the WPS admin panel configuration is registered in ``applicationContext.xml`` as follows:

.. code-block:: xml

    <!-- The default process filters -->
    <bean id="unsupportedParameterTypeProcessFilter" class="org.geoserver.wps.UnsupportedParameterTypeProcessFilter"/>
    <bean id="configuredProcessesFilter" class="org.geoserver.wps.DisabledProcessesSelector"/>

Implementation level
--------------------

At the moment the WPS is pretty much bare bones protocol-wise: it implements only the required behaviour, leaving out pretty much everything else. In particular:

* ``GetCapabilities`` and ``DescribeProcess`` are supported in both GET and POST form, but ``Execute`` is implemented only as a POST request
* there is no raster data I/O support
* there is no asynchronous support, no process monitoring, and no output storage ability
* there is no integration whatsoever with the WMS to visualize the results of an analysis (this will require output storage and per-session catalog extensions)
* the vector processes do not use any kind of disk buffering, meaning everything is kept in memory (this won't scale to larger datasets)
* there is no set of demo requests nor a GUI to build a request. This is considered fundamental to reduce the time spent figuring out how to build a proper request, so it will be tackled sooner rather than later.


The transmute package
----------------------

The ``org.geoserver.wps.transmute`` package is an earlier attempt at doing what PPIO does.
It also attempts to provide a custom schema for each type of input/output, using subsetted schemas that contain only one type (e.g., GML Point) but still have to reference the full schema
definition.

.. note:: This package is a leftover and should be completely removed and replaced with PPIO usage. At the moment only the ``DescribeProcess`` code uses it.
